[SmartSwitch] add graceful shutdown/startup utilities and visibility #4113
base: master
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
Pull Request Overview
This PR refactors module state transition tracking by moving the implementation from STATE_DB to platform-level methods. The change introduces three new methods in ModuleHelper to manage state transitions through the platform API instead of using database entries with timestamps.
- Adds set_module_state_transition, clear_module_state_transition, and get_module_state_transition methods to ModuleHelper
- Removes STATE_DB-based transition tracking functions from config/chassis_modules.py
- Updates shell script to use new platform-level transition flag management
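The three state-transition methods listed above could look roughly like the following. This is only a sketch for orientation, not the code in utilities_common/module.py: the class names ModuleHelperSketch and FakeModule, the method signatures, the return values, and the in-memory storage are all assumptions, since the PR page does not show the real implementation or the platform API it calls.

```python
class FakeModule:
    """Hypothetical stand-in for a platform module object (not the SONiC platform API)."""

    def __init__(self, name):
        self.name = name
        self.transition = None  # assumed values: "shutdown", "startup", "reboot"


class ModuleHelperSketch:
    """Illustrative helper; signatures and return values are assumptions."""

    def __init__(self, modules):
        # Index the fake modules by name for quick lookup.
        self._modules = {m.name: m for m in modules}

    def set_module_state_transition(self, module_name, transition_type):
        """Mark a transition as in progress; refuse if one is already set."""
        module = self._modules.get(module_name)
        if module is None or module.transition is not None:
            return False
        module.transition = transition_type
        return True

    def clear_module_state_transition(self, module_name):
        """Clear the in-progress marker; False if the module is unknown."""
        module = self._modules.get(module_name)
        if module is None:
            return False
        module.transition = None
        return True

    def get_module_state_transition(self, module_name):
        """Return the current transition type, or None if idle or unknown."""
        module = self._modules.get(module_name)
        return None if module is None else module.transition
```

In this sketch a second set call on the same module returns False until the flag is cleared, which mirrors the "already in progress" behaviour described in the PR description.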
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| utilities_common/module.py | Adds three new methods for managing module state transitions via platform API |
| tests/test_module.py | Adds comprehensive unit tests for the new state transition methods |
| tests/chassis_modules_test.py | Updates tests to use platform-level transition checks instead of STATE_DB queries |
| scripts/reboot_smartswitch_helper | Adds shell functions to set/clear/get state transition flags via Python API |
| config/chassis_modules.py | Removes STATE_DB transition tracking functions and simplifies shutdown/startup logic |
@rameshraghupathy @gpunathilell Could you please review this latest PR?
@rameshraghupathy Fixed the PR description. Thank you.
HLD: https://github.com/sonic-net/SONiC/blob/master/doc/smart-switch/graceful-shutdown/graceful-shutdown.md
These changes build upon enhancements in #4031
This PR adds CLI support and visibility for module-level graceful transitions (startup/shutdown/reboot) to align with the SmartSwitch/DPU lifecycle work.
What I did
How I did it
How to verify it
redis-cli -n 6 hgetall "CHASSIS_MODULE_TABLE|DPU0"
Sample outputs when "state_transition_in_progress"
Errors thrown when the same module transition is already in progress.
$ sudo config chassis modules shutdown DPU2;redis-cli -n 6 hgetall 'CHASSIS_MODULE_TABLE|DPU2';sudo reboot -d DPU2;redis-cli -n 6 hgetall 'CHASSIS_MODULE_TABLE|DPU2'
Shutting down chassis module DPU2
True
2025-11-13 18:43:22 - User requested rebooting device dpu2 ...
2025-11-13 18:43:23 - INFO: DPU dpu2 is in 'Online' state before reboot.
2025-11-13 18:43:23 - ERROR: state_transition_in_progress flag is already set for dpu2
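The rejection shown in the log above (a reboot requested while a shutdown is still in flight) amounts to a check-then-set guard on the state_transition_in_progress flag. A minimal sketch of that guard follows, assuming a plain dict stands in for the CHASSIS_MODULE_TABLE hash; the function names, the "True"/"False" string values, and the transition_type field are hypothetical and not taken from the PR.

```python
FLAG = "state_transition_in_progress"


class TransitionInProgressError(RuntimeError):
    """Raised when a module already has a transition in flight."""


def begin_transition(table, module, kind):
    """Set the flag for `module`, or raise if it is already set.

    `table` is a dict of dicts standing in for the per-module hash
    (an assumption; the real storage is not shown on the PR page).
    """
    entry = table.setdefault(module, {})
    if entry.get(FLAG) == "True":
        raise TransitionInProgressError(
            f"{FLAG} flag is already set for {module.lower()}"
        )
    entry[FLAG] = "True"
    entry["transition_type"] = kind  # hypothetical field name


def end_transition(table, module):
    """Clear the flag once the transition has finished."""
    entry = table.setdefault(module, {})
    entry[FLAG] = "False"
    entry.pop("transition_type", None)


if __name__ == "__main__":
    table = {}
    begin_transition(table, "DPU2", "shutdown")
    try:
        # A second request on the same module is rejected.
        begin_transition(table, "DPU2", "reboot")
    except TransitionInProgressError as err:
        print(f"ERROR: {err}")
```

A real implementation would also need the check and the set to be atomic across processes, which this single-process sketch does not attempt.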
Previous command output (if the output of a command-line utility has changed)
New command output (if the output of a command-line utility has changed)
$ reboot -d DPU1
True
2025-11-17 17:56:10 - User requested rebooting device dpu1 ...
2025-11-17 17:56:11 - INFO: DPU dpu1 is in 'Online' state before reboot.
2025-11-17 17:56:12 - INFO: Rebooting dpu1, ip:1X9.XXX.X00.2 gnmi_port:50XXX
2025-11-17 17:56:53 - INFO: dpu1 halted the services successfully
2025-11-17 17:58:50 - INFO: Rebooting dpu1 with reboot_type:DPU...